Appendix: City Block Distance and Rough-Fuzzy Clustering for Identification of Co-Expressed microRNAs
Authors
Abstract
In this section, the performance of the proposed rough-fuzzy clustering algorithm [1] is compared with that of hard c-means (HCM) [2], fuzzy c-means (FCM) [3], rough-fuzzy c-means (RFCM) [4], cluster identification via connectivity kernels (CLICK) [5], and self-organizing map (SOM) [6] with respect to gene ontology. The performance of the normalized range-normalized city block distance (NRNCBD) relative to the Pearson distance (PD) and Euclidean distance (ED) is also presented. The genes that are targeted by at least 75% of the miRNAs in a cluster are analyzed, and the results are reported in Fig. 1. The figure shows the final annotation ratios generated by all the algorithms at their optimum values of λ and ω for the molecular functions (MF), biological processes (BP), and cellular components (CC) ontologies on four miRNA microarray data sets. These results confirm that, in most cases, the proposed clustering algorithm provides final annotation ratios higher than or comparable to those obtained using several existing clustering algorithms. The upper portion of Fig. 1 compares the RFCM and the proposed clustering algorithm in terms of final annotation ratio, or cluster frequency, for the MF, BP, and CC ontologies on the four miRNA expression data sets. In most cases, the proposed method provides final annotation ratios higher than or comparable to those of the RFCM algorithm: out of 12 cases, it provides a higher final annotation ratio in 7. On the other hand, the RFCM with the NRNCBD generates better results in 1, 2, and 2 cases for the MF, BP, and CC ontologies, respectively. The middle portion of Fig. 1 reports the comparative final annotation ratios of the HCM, FCM, and the proposed algorithm on the four data sets. From the results reported in this portion, it is seen that, out of a total of 12 comparisons, the proposed algorithm attains a higher final annotation ratio than that ob-
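The target-gene filter mentioned in the abstract (genes targeted by at least 75% of the miRNAs in a cluster) can be illustrated with a minimal sketch. The input format (a dictionary from miRNA name to its set of target genes), the function name `shared_targets`, and the miRNA and gene names in the example are all assumptions made for illustration; the abstract does not describe how the authors implemented this step.

```python
from collections import Counter


def shared_targets(cluster_targets, min_fraction=0.75):
    """Return the genes targeted by at least `min_fraction` of the miRNAs in a cluster.

    cluster_targets: dict mapping a miRNA name to the set of genes it targets
    (hypothetical input format, not taken from the paper).
    """
    if not cluster_targets:
        return set()
    n_mirnas = len(cluster_targets)
    # Count, for every gene, how many miRNAs in the cluster target it.
    counts = Counter(
        gene for targets in cluster_targets.values() for gene in set(targets)
    )
    # Keep the genes whose count reaches the required fraction of the cluster.
    return {gene for gene, c in counts.items() if c >= min_fraction * n_mirnas}


# Hypothetical cluster of four miRNAs with made-up target genes.
example_cluster = {
    "miR-a": {"G1", "G2", "G3"},
    "miR-b": {"G1", "G2"},
    "miR-c": {"G1", "G3"},
    "miR-d": {"G1", "G2"},
}
selected = shared_targets(example_cluster)
```

In this example G1 is targeted by all four miRNAs and G2 by three of them, so both reach the 75% threshold, while G3 (two of four) does not. The selected genes would then be passed to a gene ontology annotation step, which is outside the scope of this sketch.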
Similar resources
New distance and similarity measures for hesitant fuzzy soft sets
The hesitant fuzzy soft set (HFSS), a combination of hesitant fuzzy sets and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered important tools in different areas such as ...
A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, advances in information-gathering technologies such as GPS and GSM networks have led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset
Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...
Fuzzy soft rough K-Means clustering approach for gene expression data
Clustering is one of the widely used data mining techniques for medical diagnosis. Clustering can be considered as the most important unsupervised learning technique. Most of the clustering methods group data based on distance and few methods cluster data based on similarity. The clustering algorithms classify gene expression data into clusters and the functionally related genes are grouped tog...